Orthogonal Transformer: An Efficient Vision Transformer Backbone with Token Orthogonalization

Neural Information Processing Systems

We present a general vision transformer backbone, called Orthogonal Transformer, in pursuit of both efficiency and effectiveness. A major challenge for vision transformers is that self-attention, the key element for capturing long-range dependencies, is very computationally expensive for dense prediction tasks (e.g., object detection). Coarse global self-attention and local self-attention have been designed to reduce this cost, but they suffer from either neglecting local correlations or hurting global modeling. We present an orthogonal self-attention mechanism to alleviate these issues. Specifically, self-attention is computed in an orthogonal space that is reversible to the spatial domain but has much lower resolution.
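The mechanism the abstract describes can be sketched at a high level: map the tokens into an orthogonal space with a reversible (orthogonal) transform, run attention over a lower-resolution slice of that space, then map back with the exact inverse (the transpose). The following NumPy sketch is purely illustrative and not the paper's actual operator; the token count `N`, slice size `M`, and the choice of a single Householder reflection as the orthogonal transform are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def householder(u):
    """Householder reflection H = I - 2 uu^T / (u^T u); H is orthogonal."""
    u = u / np.linalg.norm(u)
    return np.eye(len(u)) - 2.0 * np.outer(u, u)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# N spatial tokens of dimension d (illustrative sizes).
N, d = 16, 8
X = rng.standard_normal((N, d))

# Orthogonal transform over the token axis; reversible since T.T @ T = I.
T = householder(rng.standard_normal(N))

# Map tokens to the orthogonal space, attend over only the first M
# "low-resolution" components (cost M^2 instead of N^2), then map back.
M = 4
Z = T @ X                               # tokens in the orthogonal space
Zl = Z[:M]                              # lower-resolution slice
A = softmax(Zl @ Zl.T / np.sqrt(d))     # self-attention weights on the slice
Z_out = Z.copy()
Z_out[:M] = A @ Zl                      # attended slice
X_out = T.T @ Z_out                     # exact inverse back to spatial domain
```

The point of the sketch is the cost structure: attention is quadratic in the number of attended tokens, so attending over `M` components in the reversible space instead of all `N` spatial tokens reduces that quadratic term while the transform itself loses no information.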


Orthogonal Transformer: An Efficient Vision Transformer Backbone with Token Orthogonalization A Proof of Theorem 1

Neural Information Processing Systems

Herein we provide the proof of Theorem 1 in the main text. Proof A.2 constructs a Householder matrix from a vector u; Q is then the product of n − 1 orthogonal Householder matrices. Proof A.5 applies Lemma A.3 to upper-triangularize the given real orthogonal matrix A. We train the models with two common settings. The AdamW optimizer is used with a learning rate of 0.0001, weight decay of 0.05, and a batch size of 16. We apply Orthogonal Transformer pretrained on ImageNet-1K as the backbone network. Fig. I and Fig. II show the detailed architectures of the convolutional patch embedding; the last convolution has a kernel size of 1×1, followed by a LayerNorm layer.
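The decomposition used in Proof A.2 can be checked numerically: a product of Householder reflections is itself orthogonal. The sketch below builds a product of n − 1 reflections from random vectors (the dimension `n` and the random vectors are illustrative, not taken from the paper) and verifies orthogonality.

```python
import numpy as np

rng = np.random.default_rng(1)

def householder(u):
    """Householder reflection H = I - 2 uu^T / (u^T u); H is orthogonal."""
    u = u / np.linalg.norm(u)
    return np.eye(len(u)) - 2.0 * np.outer(u, u)

n = 5
# A product of n - 1 orthogonal Householder matrices is orthogonal,
# since products of orthogonal matrices are orthogonal.
Q = np.eye(n)
for _ in range(n - 1):
    Q = Q @ householder(rng.standard_normal(n))
```

Orthogonality here means Q^T Q = Q Q^T = I, which is exactly the reversibility property the orthogonal self-attention mechanism relies on.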


